Anthony V. Fiacco

1. Introduction

For a general class of parametric, twice continuously differentiable
nonlinear programming problems, Fiacco (1976) obtained a theoretical basis
for characterizing the differentiability properties of a local solution and
the associated optimal Lagrange multipliers with respect to parametric
variation, and established the use of the usual Lagrangian to calculate
exact sensitivity information and of a penalty function to estimate
sensitivity information. Armacost and Fiacco (1975) obtained first- and
second-order changes in the optimal value function and pursued the
computational aspects of computing the first-order changes of a Kuhn-Tucker
triple. Formulas were obtained for calculating these changes when a
Kuhn-Tucker triple is known, and for estimating these changes when a penalty
function minimum is known.

Armacost and Fiacco (1976a) further refined these results for the problem
where the parameters are restricted to be the right-hand side components of
the constraints.

The procedure developed by Fiacco (1976) was implemented computationally
by Armacost and Mylander (1973), with computational experience reported by
Armacost and Fiacco (1974). The computational procedures were extended and
additional computational experience reported by Armacost (1976a) and by
Armacost and Fiacco (1976b). Recently, Buys and Gonin (1975), paralleling
the cited results developed by way of usual penalty functions, obtained
the first-order sensitivity results for the Kuhn-Tucker triple and the
optimal value function in terms of an augmented Lagrangian. Armacost
(1976b) subsequently used the approach developed by Armacost and Fiacco
(1975) to develop these sensitivity results and extend the penalty function
approximation procedure to a more general type of sequential algorithm.

This paper presents a much simpler development of the sensitivity
results using augmented Lagrangians (obtained independently by Buys and
Gonin (1975)), following the direct approach of Fiacco (1976). We also show
that these results are exact and equivalent to the sensitivity computations
developed by Armacost and Fiacco (1975) when a Kuhn-Tucker triple is
known. As indicated, this is based in part on material in the dissertation
by Armacost (1976b).

In Section 2, we review the supporting theory from Fiacco (1976) and
Armacost and Fiacco (1975). In Section 3, we present the necessary
background on augmented Lagrangians. In Section 4, we develop the sensitivity
results using augmented Lagrangians, and in Section 5 show their exactness
and equivalence to the computed expressions for the case when a Kuhn-Tucker
triple is known. Several conclusions and extensions are noted in Section 6.

2.  Supporting  Theory 

The parametric mathematical programming problems considered here are
of the form

$$\begin{aligned}
\text{minimize}\quad & f(x,\epsilon) \\
\text{subject to}\quad & g_i(x,\epsilon) \ge 0, \quad i = 1,\ldots,m, \qquad\qquad P(\epsilon) \\
& h_j(x,\epsilon) = 0, \quad j = 1,\ldots,p,
\end{aligned}$$

where $x$ is the usual $n$-vector of variables and $\epsilon$ is a
$k$-component vector of numbers called "parameters." It is desired
ultimately to develop a complete characterization of a solution
$x(\epsilon)$ of Problem $P(\epsilon)$ as a function of $\epsilon$.
In our current work, we have concentrated on certain recently computationally
tractable measures of change in a solution as $\epsilon$ is perturbed from a
specified value. (Without loss of generality, we assume that the specified
value is $\epsilon = 0$.)

The Lagrangian for Problem $P(\epsilon)$ is defined as

$$L(x,u,w,\epsilon) = f(x,\epsilon) - \sum_{i=1}^{m} u_i g_i(x,\epsilon) + \sum_{j=1}^{p} w_j h_j(x,\epsilon), \tag{1}$$

where $u_i$, $i = 1,\ldots,m$, and $w_j$, $j = 1,\ldots,p$, are "Lagrange
multipliers" associated with the inequality and equality constraints,
respectively. Any vector $(x,u,w)$ satisfying the usual (first order)
Kuhn-Tucker conditions (Fiacco and McCormick, 1968) of Problem $P(\epsilon)$
is called a Kuhn-Tucker triple.

The following four assumptions are sufficient to establish the results
and are assumed to hold throughout the paper:

A1 - The functions defining Problem $P(\epsilon)$ are twice continuously
differentiable in $(x,\epsilon)$ in a neighborhood of $(x^*,0)$.

A2 - The second order sufficient conditions for a local minimum
of Problem $P(0)$ hold at $x^*$ with associated Lagrange
multipliers $u^*$ and $w^*$.

A3 - The gradients $\nabla_x g_i(x^*,0)$ for all $i$ such that
$g_i(x^*,0) = 0$, and $\nabla_x h_j(x^*,0)$, $j = 1,\ldots,p$, are linearly
independent.

A4 - Strict complementary slackness holds at $(x^*,0)$ (i.e.,
$u_i^* > 0$ for all $i$ such that $g_i(x^*,0) = 0$).

Lemma 1: (Local characterization of a Kuhn-Tucker triple of Problem
$P(\epsilon)$, Fiacco (1976)). If Assumptions A1, A2, A3 and A4 hold for
Problem $P(\epsilon)$ at $(x^*,0)$, then

(a) $x^*$ is a local isolated minimizing point of Problem $P(0)$
and the associated Lagrange multipliers $u^*$ and $w^*$ are
unique;

(b) for $\epsilon$ in a neighborhood of 0, there exists a unique,
once continuously differentiable vector function
$y(\epsilon) = (x(\epsilon), u(\epsilon), w(\epsilon))^T$ satisfying the
second order sufficient conditions for a local minimum of Problem
$P(\epsilon)$ such that $y(0) = (x^*,u^*,w^*)^T = y^*$ and hence,
$x(\epsilon)$ is a locally unique, local minimum of Problem $P(\epsilon)$
with associated unique Lagrange multipliers $u(\epsilon)$ and
$w(\epsilon)$; and

(c) for $\epsilon$ near 0, the set of binding inequalities is
unchanged, strict complementary slackness holds for $u_i(\epsilon)$
for $i$ such that $g_i(x(\epsilon),\epsilon) = 0$, and the binding
constraint gradients are linearly independent at $x(\epsilon)$.

This result provides a characterization of a local solution of Problem
$P(\epsilon)$ and its associated optimal Lagrange multipliers near
$\epsilon = 0$. It generalizes a theorem first presented by Fiacco and
McCormick (1968, Theorem 6) and is closely related to a generalization of the
same theorem provided independently by Robinson (1974). It shows that the
Kuhn-Tucker triple $y(\epsilon)$ is unique and well behaved, under the given
conditions. Since $y(\epsilon)$ is once differentiable, the partial
derivatives of the components of $y(\epsilon)$ are well defined. This fact
and Assumption A1 also mean that the functions defining Problem $P(\epsilon)$
are once continuously differentiable functions of $\epsilon$ along the
"solution trajectory" $x(\epsilon)$ near $\epsilon = 0$, and the Lagrangian
is a once continuously differentiable function of $\epsilon$ along the
"Kuhn-Tucker point trajectory."


We are thus motivated to determine a means to calculate the various
partial derivatives, since this yields a first order estimate of the locally
optimal Kuhn-Tucker triple and the problem functions near $\epsilon = 0$.

Denote by $\nabla_\epsilon x(\epsilon) = (\partial x_i(\epsilon)/\partial \epsilon_j)$,
$i = 1,\ldots,n$, $j = 1,\ldots,k$, the $n \times k$ matrix of partial
derivatives of $x(\epsilon)$ with respect to $\epsilon$, and define
$\nabla_\epsilon u(\epsilon)$ and $\nabla_\epsilon w(\epsilon)$ in a similar
fashion. We then define

$$\nabla_\epsilon y(\epsilon) = (\nabla_\epsilon x(\epsilon)^T, \nabla_\epsilon u(\epsilon)^T, \nabla_\epsilon w(\epsilon)^T)^T,$$

an $(n+m+p) \times k$ matrix.

When $y(\epsilon)$ is available, $\nabla_\epsilon y(\epsilon)$ can be
calculated by noting that Conclusion (b) of Lemma 1 implies the satisfaction
of the Kuhn-Tucker conditions for $P(\epsilon)$ at $y(\epsilon)$ near
$\epsilon = 0$, i.e.,

$$\nabla_x L(x(\epsilon),u(\epsilon),w(\epsilon),\epsilon) = 0, \tag{2}$$

$$u_i(\epsilon)\, g_i(x(\epsilon),\epsilon) = 0, \quad i = 1,\ldots,m, \tag{3}$$

$$h_j(x(\epsilon),\epsilon) = 0, \quad j = 1,\ldots,p. \tag{4}$$

Since the Jacobian, $M(\epsilon)$, of this system with respect to $(x,u,w)$
(i.e., the matrix obtained by differentiating the left side of (2) - (4) with
respect to the components of $(x,u,w)$) is nonsingular under the given
assumptions, the total derivative of the system with respect to $\epsilon$ is
well defined and must equal zero. This yields

$$M(\epsilon)\nabla_\epsilon y(\epsilon) = N(\epsilon), \tag{5}$$

where $N(\epsilon)$ is the negative of the Jacobian of the Kuhn-Tucker system
with respect to $\epsilon$, and hence

$$\nabla_\epsilon y(\epsilon) = M(\epsilon)^{-1} N(\epsilon). \tag{6}$$
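As a small numerical illustration of (5) and (6), the following Python sketch solves $M\nabla_\epsilon y = N$ for a hypothetical two-variable problem that is not taken from the paper: minimize $\tfrac12(x_1^2 + x_2^2)$ subject to $g(x,\epsilon) = x_1 + x_2 - 1 - \epsilon \ge 0$. The example problem, the helper name `solve`, and the use of exact rational arithmetic are assumptions made only for illustration. For this problem $x_1(\epsilon) = x_2(\epsilon) = u(\epsilon) = (1+\epsilon)/2$, so each entry of $\nabla_\epsilon y(0)$ should equal $1/2$.

```python
from fractions import Fraction as F

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(A)
    aug = [[F(v) for v in A[i]] + [F(v) for v in B[i]] for i in range(n)]
    for col in range(n):
        # Swap in a row with a nonzero pivot; one exists when A is nonsingular.
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Kuhn-Tucker data at eps = 0 for the illustrative problem: x* = (1/2, 1/2).
u_star = F(1, 2)   # multiplier of the binding constraint
g_star = F(0)      # constraint value at x*

# Jacobian of (x1 - u, x2 - u, u * g) with respect to (x1, x2, u).
M = [[1, 0, -1],
     [0, 1, -1],
     [u_star, u_star, g_star]]

# Negative of the Jacobian of the same system with respect to eps
# (only u * g depends on eps, with dg/deps = -1).
N = [[0], [0], [u_star]]

dy = solve(M, N)   # rows: dx1/deps, dx2/deps, du/deps
```

Solving the linear system directly, rather than forming $M^{-1}$, is the cheaper route when only $\nabla_\epsilon y$ is wanted; the elimination routine here is a stand-in for any linear solver.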

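The next result applies this theory to an analysis of the optimal
value function of Problem $P(\epsilon)$ along the Kuhn-Tucker point trajectory
$[x(\epsilon),u(\epsilon),w(\epsilon)]$.

The optimal value function is defined as:

$$f^*(\epsilon) = f[x(\epsilon),\epsilon].$$

As noted following Lemma 1, when
$y(\epsilon) = [x(\epsilon)^T, u(\epsilon)^T, w(\epsilon)^T]^T$ is available,
$\nabla_\epsilon y(\epsilon) = [\nabla_\epsilon x(\epsilon)^T, \nabla_\epsilon u(\epsilon)^T, \nabla_\epsilon w(\epsilon)^T]^T$
can be calculated. We briefly recapitulate and then analyze various cases in
some detail.

Recall that the Kuhn-Tucker first-order necessary conditions satisfied by
$y(\epsilon)$ for Problem $P(\epsilon)$ are

$$\nabla_x L(y(\epsilon),\epsilon) = \nabla_x f(x(\epsilon),\epsilon) - \sum_{i=1}^{m} u_i(\epsilon)\nabla_x g_i(x(\epsilon),\epsilon) + \sum_{j=1}^{p} w_j(\epsilon)\nabla_x h_j(x(\epsilon),\epsilon) = 0,$$

together with (3) and (4). Let

$$N(\epsilon) = (-\nabla_{\epsilon x}^2 L^T,\ \ldots,\ -u_i\nabla_\epsilon g_i^T,\ \ldots,\ -\nabla_\epsilon h_j^T,\ \ldots)^T$$

and let

$$M(\epsilon) = \begin{bmatrix}
\nabla_x^2 L & -\nabla_x g_1^T & \cdots & -\nabla_x g_m^T & \nabla_x h_1^T & \cdots & \nabla_x h_p^T \\
u_1\nabla_x g_1 & g_1 & & & & & \\
\vdots & & \ddots & & & 0 & \\
u_m\nabla_x g_m & & & g_m & & & \\
\nabla_x h_1 & & & & & & \\
\vdots & & 0 & & & 0 & \\
\nabla_x h_p & & & & & &
\end{bmatrix}.$$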


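Since the system (2) - (4) is identically satisfied for $\epsilon$ near 0, it
can be differentiated with respect to $\epsilon$ to obtain

$$M(\epsilon)\nabla_\epsilon y(\epsilon) = N(\epsilon). \tag{5}$$

Under the conditions of Lemma 1, $M(\epsilon)$ has an inverse, thus

$$\nabla_\epsilon y(\epsilon) = M(\epsilon)^{-1} N(\epsilon); \tag{6}$$

$M(\epsilon)$ is an $n+m+p$ square matrix and $\nabla_\epsilon y(\epsilon)$
and $N(\epsilon)$ are $(n+m+p) \times k$ matrices.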

Clearly, any method of solving (5), a system of linear equations,
is satisfactory and $M(\epsilon)$ need not be inverted as in (6). However,
under the given assumptions the work involved in calculating
$M(\epsilon)^{-1}$ can be significantly reduced, as will become evident.
Assume throughout this section that the conditions of Lemma 1 hold, and
suppose, without loss of generality, that the first $r$ inequality
constraints are binding. Let $\bar M(\epsilon)$ be defined as follows:

$$\bar M(\epsilon) = \begin{bmatrix}
\nabla_x^2 L & -\nabla_x g_1^T & \cdots & -\nabla_x g_r^T & \nabla_x h_1^T & \cdots & \nabla_x h_p^T \\
u_1\nabla_x g_1 & g_1 & & & & & \\
\vdots & & \ddots & & & 0 & \\
u_r\nabla_x g_r & & & g_r & & & \\
\nabla_x h_1 & & & & & & \\
\vdots & & 0 & & & 0 & \\
\nabla_x h_p & & & & & &
\end{bmatrix}.$$

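Then, rearranging the rows and columns corresponding to the last $m - r$
(nonbinding) inequality constraints, it follows that

$$M(\epsilon) = \begin{bmatrix}
\bar M(\epsilon) & \begin{matrix} -\nabla_x g_{r+1}^T \ \cdots\ -\nabla_x g_m^T \\ 0 \end{matrix} \\
0 & \mathrm{diag}(g_{r+1},\ldots,g_m)
\end{bmatrix}.$$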


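Let $\bar y(\epsilon) = (x(\epsilon)^T, \bar u(\epsilon)^T, w(\epsilon)^T)^T$ and
$\bar N(\epsilon) = (-\nabla_{\epsilon x}^2 L^T, \ldots, -u_i\nabla_\epsilon g_i^T, \ldots, -\nabla_\epsilon h_j^T, \ldots)^T$,
where $i = 1,\ldots,r$, $j = 1,\ldots,p$, and
$\bar u(\epsilon) = (u_1(\epsilon),\ldots,u_r(\epsilon))^T$.
Since $g_i(x(0),0) > 0$, $i = r+1,\ldots,m$, complementary slackness and
continuity imply that $u_i(\epsilon) = 0$, $i = r+1,\ldots,m$, for $\epsilon$
near 0. Hence, the corresponding components of $\nabla_\epsilon y(\epsilon)$
are zero, i.e., $\nabla_\epsilon u_i(\epsilon) = 0$, $i = r+1,\ldots,m$,
and it follows that (5) may be reduced to solving the system

$$\bar M(\epsilon)\nabla_\epsilon \bar y(\epsilon) = \bar N(\epsilon). \tag{9}$$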

e 

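Let $\bar g \equiv (g_1,\ldots,g_r)^T$, $\bar G \equiv \mathrm{diag}(g_i)$ and
$\bar U \equiv \mathrm{diag}(u_i)$, $i = 1,\ldots,r$,
$\nabla_x \bar g = [\nabla_x g_1^T,\ldots,\nabla_x g_r^T]^T$ and
$\nabla_x h = [\nabla_x h_1^T,\ldots,\nabla_x h_p^T]^T$. Since $\bar G = 0$,
it follows that

$$\bar M(\epsilon) = \begin{bmatrix} I & 0 & 0 \\ 0 & -\bar U & 0 \\ 0 & 0 & I \end{bmatrix}
\begin{bmatrix} \nabla_x^2 L & -\nabla_x \bar g^T & \nabla_x h^T \\ -\nabla_x \bar g & 0 & 0 \\ \nabla_x h & 0 & 0 \end{bmatrix}.$$

Thus

$$\bar M(\epsilon)^{-1} = \begin{bmatrix} \nabla_x^2 L & -\nabla_x \bar g^T & \nabla_x h^T \\ -\nabla_x \bar g & 0 & 0 \\ \nabla_x h & 0 & 0 \end{bmatrix}^{-1}
\begin{bmatrix} I & 0 & 0 \\ 0 & -\bar U^{-1} & 0 \\ 0 & 0 & I \end{bmatrix}. \tag{10}$$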


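Thus far, we have assumed no special structure on the problem or
the nature of $\bar M(\epsilon)$. However, in order to make further progress
in calculating $\bar M(\epsilon)^{-1}$, consider several special cases:
(1) $\nabla_x^2 L^{-1}$ exists; (2) $\nabla_x^2 L = 0$; (3) there are $n$
linearly independent constraint gradients; and (4) $r + p < n$,
$\nabla_x^2 L \ne 0$ and $\nabla_x^2 L^{-1}$ does not exist.

Let $P = (-\nabla_x \bar g^T, \nabla_x h^T)^T$, an $(r+p) \times n$ matrix. Let
$\hat M(\epsilon)^{-1}$ denote the left matrix on the right-hand side of
Equation (10), thus

$$\hat M(\epsilon) = \begin{bmatrix} \nabla_x^2 L & P^T \\ P & 0 \end{bmatrix}. \tag{11}$$

Now suppose that

$$\hat M(\epsilon)^{-1} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}. \tag{12}$$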


Our task is to determine the $A_{ij}$ for the various cases.

Case 1: $\nabla_x^2 L^{-1}$ exists.

It is easily shown that

$$\begin{aligned}
A_{11} &= \nabla_x^2 L^{-1}\,[\,I - P^T (P\,\nabla_x^2 L^{-1} P^T)^{-1} P\,\nabla_x^2 L^{-1}\,], \\
A_{12} &= A_{21}^T = \nabla_x^2 L^{-1} P^T\,[\,P\,\nabla_x^2 L^{-1} P^T\,]^{-1}, \\
A_{22} &= -[\,P\,\nabla_x^2 L^{-1} P^T\,]^{-1}.
\end{aligned} \tag{13}$$

Note that $[P\,\nabla_x^2 L^{-1} P^T]^{-1}$ exists by our assumptions.
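The Case 1 block formulas can be checked numerically. The sketch below uses hypothetical data that are not from the paper, a Hessian $\nabla_x^2 L = \mathrm{diag}(2, 4)$ and a single constraint gradient row $P = (1, 1)$, builds the blocks $A_{11}$, $A_{12}$, $A_{21}$, $A_{22}$ as in (13), and confirms that they assemble into the inverse of the bordered matrix in (11). Because only one constraint is present, $P\,\nabla_x^2 L^{-1} P^T$ is a scalar, which keeps the code short; the helper name `matmul` is likewise an illustrative choice.

```python
from fractions import Fraction as F

def matmul(A, B):
    """Plain list-of-rows matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

# Hypothetical data: Hessian of the Lagrangian and one constraint gradient row.
H = [[F(2), F(0)], [F(0), F(4)]]
H_inv = [[F(1, 2), F(0)], [F(0), F(1, 4)]]   # inverse of the diagonal H
P = [[F(1), F(1)]]
Pt = [list(col) for col in zip(*P)]          # P transposed (2 x 1)

# S = P H^{-1} P^T is 1 x 1 here, so its inverse is a scalar division.
S = matmul(matmul(P, H_inv), Pt)[0][0]

A12 = [[row[0] / S] for row in matmul(H_inv, Pt)]   # H^{-1} P^T S^{-1}
A21 = [list(col) for col in zip(*A12)]              # A12 transposed
A22 = [[-1 / S]]                                    # -S^{-1}
corr = matmul(A12, matmul(P, H_inv))                # H^{-1} P^T S^{-1} P H^{-1}
A11 = [[H_inv[i][j] - corr[i][j] for j in range(2)] for i in range(2)]

# Assemble the bordered matrix and the candidate inverse from the blocks.
M_hat = [H[0] + Pt[0], H[1] + Pt[1], P[0] + [F(0)]]
M_hat_inv = [A11[0] + A12[0], A11[1] + A12[1], A21[0] + A22[0]]

identity = [[F(int(i == j)) for j in range(3)] for i in range(3)]
product = matmul(M_hat, M_hat_inv)
```

With more than one binding constraint the scalar division would be replaced by the inverse of the $(r+p) \times (r+p)$ matrix $P\,\nabla_x^2 L^{-1} P^T$.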


Case 2: $\nabla_x^2 L = 0$.

There are two possible situations: there are $r + p < n$, or
$r + p = n$, linearly independent binding constraint gradients. If there are
fewer than $n$, Assumption A2 is violated and it is easily seen that
$\hat M(\epsilon)^{-1}$ does not exist. For example, this corresponds to the
situation characterizing a degenerate solution in linear programming. In this
case, $y(\epsilon)$ may not be differentiable. We shall not pursue this
possibility further here. When there are $n$ linearly independent binding
constraint gradients, we have a special instance of Case 3, which is
developed below.

Case 3: There are $n$ linearly independent binding constraint gradients.

The Jacobian of the $n$ constraints with respect to the $n$ variables
is non-vanishing. It is easily shown that in (11), $(P^T)^{-1} = P^{-T}$
exists. Hence,

$$\begin{aligned}
A_{11} &= 0, \\
A_{12} &= A_{21}^T = P^{-1}, \\
A_{22} &= -P^{-T}\,\nabla_x^2 L\,P^{-1}.
\end{aligned} \tag{14}$$

Note that if $\nabla_x^2 L^{-1}$ exists, (14) is computable from (13);
however, here the existence of $\nabla_x^2 L^{-1}$ is not assumed. Also, the
remaining Case 2 possibility mentioned above gives $A_{11} = A_{22} = 0$ and
$A_{12} = A_{21}^T = P^{-1}$.

Note also that the standard linear programming problem nondegenerate
solution case falls into this latter category, with $n$ linearly independent
binding constraint gradients and $\nabla_x^2 L = 0$.

Case 4: $r + p < n$, $\nabla_x^2 L \ne 0$ and $\nabla_x^2 L^{-1}$ does not exist.

This represents the more general situation and is treated in detail
here. In Equation (9) $\bar M(\epsilon)$ is an $n+r+p$ square matrix. By
Assumption A3, the $\nabla_x g_i$, $i = 1,\ldots,r$, and the $\nabla_x h_j$,
$j = 1,\ldots,p$, are linearly independent. By Assumption A2 the second-order
sufficient conditions for a local minimum are satisfied. Thus, the Hessian of
the Lagrangian is positive definite with respect to those nonzero vectors
orthogonal to the binding constraint gradients and hence

$$z^T \nabla_x^2 L\, z > 0 \ \text{ for all } z \ne 0 \text{ such that } Pz = (-\nabla_x \bar g^T, \nabla_x h^T)^T z = 0. \tag{15}$$

Under the assumption of linear independence of the binding constraint
gradients, $P$ has rank $r + p$, and without loss of generality, assume that
the submatrix involving the first $r + p$ columns of $P$ is nonsingular.
Therefore, we partition $P$ as follows:

$$P = (P_1, P_2),$$

where $P_1$ is an $(r+p) \times (r+p)$ nonsingular matrix and $P_2$ is an
$(r+p) \times q$ matrix where $q = n-r-p$. It is easily shown that the matrix
[...] is positive definite.

This fact leads us to a representation of $\hat M(\epsilon)^{-1}$. Using the
partitions defined above, [...] (16) and [...], where [...] and [...]. (17)

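This representation of the inverse of $\hat M(\epsilon)$ is due to McCormick
(1975).

We have now developed analytical expressions for $\hat M(\epsilon)^{-1}$ for
all cases that can occur, and now turn to the calculation of
$\nabla_\epsilon \bar y(\epsilon)$, using the block components of
$\hat M(\epsilon)^{-1}$.

Careful attention to the algebra yields the following expression for
the first-order sensitivity of a Kuhn-Tucker triple:

$$\nabla_\epsilon \bar y(\epsilon) = \begin{bmatrix} \nabla_\epsilon x(\epsilon) \\ \nabla_\epsilon \bar u(\epsilon) \\ \nabla_\epsilon w(\epsilon) \end{bmatrix} = \hat M(\epsilon)^{-1} \hat N(\epsilon), \tag{18}$$

where

$$\hat N(\epsilon) = \begin{bmatrix} -\nabla_{\epsilon x}^2 L \\ \nabla_\epsilon \bar g \\ -\nabla_\epsilon h \end{bmatrix}.$$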


Equation (18) holds for the general problem whenever Equation (9) is
well defined, e.g., when the conditions of Lemma 1 hold. For the particular
cases treated earlier in this section, $\nabla_\epsilon \bar y(\epsilon)$ may
be calculated from (18) by first evaluating the $A_{ij}$ as given in (13),
(14), or (17), depending on which conditions apply.

Equivalent results will be obtained in Section 4 using augmented
Lagrangians.

3. Background on Augmented Lagrangians

As noted in the Introduction, Fiacco (1976) established a basis for
estimating the sensitivity of a Kuhn-Tucker triple when the Kuhn-Tucker triple
is not known but can be estimated using a penalty function. Penalty function
algorithms belong to a more general class of algorithms whereby the constrained
problem is transformed into a sequence of unconstrained problems by means of
an auxiliary function.

Motivated in part by an effort either to "regularize" the usual
Lagrangian (1) or to overcome the ill-conditioning typically associated with
the traditional penalty function procedures, several other classes of related
auxiliary functions have recently received considerable attention. These
include exact penalty functions and generalized Lagrangians [Arrow, Gould and
Howe (1973)] and augmented Lagrangians [Hestenes (1969) and Buys (1972)].
The functions are similar in structure and derive their different
names largely from the viewpoint taken in their development. The following
description gives a general idea of the different approaches applied to a
"standard" NLP problem of the form $P(\epsilon)$, i.e., where the parameter
$\epsilon$ is not present.

Generalized Lagrangians generally refer to more general forms of the
usual Lagrangian (1) associated with Problem P (suppressing $\epsilon$
throughout). Typically, $f$, $g_i$, $h_j$, $u_i$ or $w_j$ are replaced by
functions of these quantities. The object is to structure the new function so
that it behaves "like" a Lagrangian (e.g., in characterizing optimality) but
acquires certain desirable properties (e.g., convexity or strict convexity)
that (1) does not possess. A popular extension introduces generalized
Lagrange multipliers that may also be functions of $x$. Augmented Lagrangians
are usually formed by adding a penalty term to the usual Lagrangian, though
this class can easily be enlarged by using the generalized Lagrangians
described earlier in the paragraph. In this context, the term "Method of
Multipliers" refers to a particular approach using a particular form of the
augmented Lagrangian. The term is due to Hestenes (1969), who proposed an
algorithm based on sequentially improving the estimates of the Lagrange
multipliers.

A penalty function would be considered exact if, for particular values
of certain parameters, an unconstrained (local) minimum of the function
(locally) solves the given programming problem. Since the "optimal value"
of these parameters is generally unknown, they must be estimated, and since
these parameters often correspond to Lagrange multipliers, the result is
that augmented Lagrangian and exact penalty function algorithms are quite
similar in spirit.

It is much easier to deal with equality constraints in these methods.
When considering inequality constraints, penalty functions may be introduced
for which higher than first order differentiability is not inherited from
the problem functions, and which, in some cases, are only piecewise
differentiable.

Arrow, Gould, and Howe (1973) develop a saddle point theory for an
"extended" Lagrangian, defining multiplier functions with certain limiting
properties. Realizations of their extended Lagrangian yield many of the
augmented Lagrangians and exact penalty functions found in the literature.

As indicated, a primary motivation for the study of augmented
Lagrangians and exact penalty functions has been to overcome the problems
of ill-conditioning associated with ordinary penalty functions. Because
of this, considerable effort has been made to investigate the computational
aspects of these algorithms [Bertsekas (1975), Miele et al. (1972), Rockafellar
(1973), and Rupp (1976)]. Buys (1972) provided perhaps the best detailed
analysis of a dual approach, using an augmented Lagrangian first introduced
by Rockafellar (1970). Recently, Buys and Gonin (1975) used the same
approach to obtain sensitivity analysis results in terms of the same augmented
Lagrangian formulation. Since our intent is to redevelop these sensitivity
results, further descriptions of the computational approaches are not included.

In the interest of keeping the focus, the specific augmented Lagrangian
used by Buys and Gonin (1975) will be utilized here. It is noted, however,
that the approach is directly applicable to the general form of the extended
Lagrangian given by Arrow, Gould, and Howe, thus encompassing most of the
currently popular augmented Lagrangians and exact penalty functions.

4. Augmented Lagrangians and Sensitivity Analysis

Rather than use the dual approach of Buys, the sensitivity results
which obtain using augmented Lagrangians follow directly by considering the
equations which are satisfied at a solution point, as in Lemma 1. The key
point in the following development is that the gradient of the augmented
Lagrangian is equal to the gradient of the ordinary Lagrangian near a
solution point, when the problem parameter $\epsilon$ is perturbed.


Let $c > 0$ be a constant, $J = \{i: u_i - c\,g_i(x,0) \ge 0,\ i = 1,\ldots,m\}$
and $K = \{i: u_i - c\,g_i(x,0) < 0,\ i = 1,\ldots,m\}$. Assume that Assumptions
A1, A2, A3 and A4 hold for Problem $P(\epsilon)$. Let $x^*$ be a local minimum
of Problem $P(0)$ with associated Lagrange multipliers $u^*$ and $w^*$. Let
$B^*(0) = \{i: g_i(x^*,0) = 0,\ i = 1,\ldots,m\}$. Then, for
$i \in B^*(0)$, $u_i^* - c\,g_i(x^*,0) = u_i^* > 0$, and for $i = 1,\ldots,m$,
$i \notin B^*(0)$, $g_i(x^*,0) > 0$ and hence
$u_i^* - c\,g_i(x^*,0) = -c\,g_i(x^*,0) < 0$. Thus at the solution point, $J$
is defined with strict inequality and corresponds to $B^*(0)$, and $K$
contains all $i$ such that $g_i(x^*,0) > 0$.

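The augmented Lagrangian is defined as

$$\Phi(x,u,w,\epsilon,c) = f(x,\epsilon) - \sum_{i \in J} \left(u_i - \tfrac{c}{2}\, g_i(x,\epsilon)\right) g_i(x,\epsilon) + \sum_{j=1}^{p} \left(w_j + \tfrac{c}{2}\, h_j(x,\epsilon)\right) h_j(x,\epsilon) - \frac{1}{2c} \sum_{i \in K} u_i^2. \tag{19}$$

The gradient of $\Phi$ taken with respect to $x$ is

$$\nabla_x \Phi(x,u,w,\epsilon,c) = \nabla_x L + \sum_{i \in J} c\, g_i \nabla_x g_i + \sum_{j=1}^{p} c\, h_j \nabla_x h_j. \tag{20}$$

At $(x,u,w,\epsilon,c) = (x^*,u^*,w^*,0,c)$, since $g_i(x^*,0) = 0$,
$i \in J$, and $h_j(x^*,0) = 0$ for all $j$, it follows that

$$\nabla_x \Phi(x^*,u^*,w^*,0,c) = \nabla_x L(x^*,u^*,w^*,0). \tag{21}$$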
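The gradient identity (21) can be observed on a toy instance. The sketch below is an illustration under assumptions that are not from the paper: the objective $\tfrac12(x_1^2 + x_2^2)$, one binding constraint $g_1 = x_1 + x_2 - 1 \ge 0$, one nonbinding constraint $g_2 = x_1 + 1 \ge 0$, and $c = 10$. It evaluates the Lagrangian (1) and the augmented Lagrangian (19), with the index sets $J$ and $K$ decided by the sign of $u_i - c\,g_i$, and differentiates both by central differences at the Kuhn-Tucker point $x^* = (1/2, 1/2)$, $u^* = (1/2, 0)$. All functions involved are quadratic in $x$ near $x^*$, so central differences in rational arithmetic are exact.

```python
from fractions import Fraction as F

def f(x):
    return (x[0] ** 2 + x[1] ** 2) / 2

# Hypothetical inequality constraints g_i(x) >= 0.
g = [lambda x: x[0] + x[1] - 1,   # binding at x*
     lambda x: x[0] + 1]          # not binding at x*

def lagrangian(x, u):
    """Usual Lagrangian (1), inequality constraints only."""
    return f(x) - sum(ui * gi(x) for ui, gi in zip(u, g))

def augmented(x, u, c):
    """Augmented Lagrangian (19), inequality constraints only."""
    val = f(x)
    for ui, gi in zip(u, g):
        if ui - c * gi(x) >= 0:                      # index i belongs to J
            val -= (ui - c * gi(x) / 2) * gi(x)
        else:                                        # index i belongs to K
            val -= ui ** 2 / (2 * c)
    return val

def grad(fun, x, h=F(1, 1000)):
    """Central-difference gradient; exact for quadratics in rational arithmetic."""
    out = []
    for k in range(len(x)):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        out.append((fun(xp) - fun(xm)) / (2 * h))
    return out

x_star = [F(1, 2), F(1, 2)]
u_star = [F(1, 2), F(0)]
c = F(10)

grad_L = grad(lambda x: lagrangian(x, u_star), x_star)
grad_Phi = grad(lambda x: augmented(x, u_star, c), x_star)
```

Both gradients vanish at the Kuhn-Tucker point, so the stationarity condition on $\Phi$ picks out the same point as the one on $L$; the two functions differ in their second derivatives, which is where the penalty constant $c$ enters.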


The augmented Lagrangian is twice continuously differentiable in $x$ except
at points where $u_i - c\,g_i(x,\epsilon) = 0$. By Assumption A4,
$u_i^* - c\,g_i(x^*,0) \ne 0$ for all $i$ and hence, $\Phi$ is twice
continuously differentiable for $(x,u,\epsilon)$ near $(x^*,u^*,0)$.
Differentiating (20) with respect to $x$ yields

$$\nabla_x^2 \Phi(x,u,w,\epsilon,c) = \nabla_x^2 L + \sum_{i \in J} c\, g_i \nabla_x^2 g_i + \sum_{i \in J} c\, \nabla_x g_i^T \nabla_x g_i + \sum_{j=1}^{p} c\, h_j \nabla_x^2 h_j + \sum_{j=1}^{p} c\, \nabla_x h_j^T \nabla_x h_j. \tag{22}$$


Again, at $(x,u,w,\epsilon,c) = (x^*,u^*,w^*,0,c)$,

$$\nabla_x^2 \Phi(x^*,u^*,w^*,0,c) = \nabla_x^2 L^* + c \sum_{i \in J} \nabla_x g_i^{*T} \nabla_x g_i^* + c \sum_{j=1}^{p} \nabla_x h_j^{*T} \nabla_x h_j^*, \tag{23}$$

where the exponent * denotes evaluation of the given functions at
$(x^*,u^*,w^*,0)$. Without loss of generality, assume that
$J = B^*(0) = \{1,\ldots,r\}$. Then, recalling (see Section 2, just prior to
Equation (11)) that $P = (-\nabla_x \bar g^T, \nabla_x h^T)^T$, it follows that

$$\nabla_x^2 \Phi = \nabla_x^2 L + c P^T P. \tag{24}$$

Since $\nabla_x^2 L$ is positive definite for all $z \ne 0$ such that
$Pz = 0$ by Assumptions A2 and A4, then for all $c$ sufficiently large it
follows easily that $\nabla_x^2 \Phi$ is positive definite near
$(x^*,u^*,w^*,0,c)$. Thus, there is a number $c^* > 0$ such that for
$c > c^*$, $\nabla_x^2 \Phi(x^*,u^*,w^*,0,c)$ is positive definite (and hence
nonsingular). Assume that $c > c^*$ and the notation of Section 2 in the
remainder of this section. Recall that the first $r$ inequality constraints
are assumed binding, the superbar indicating evaluation at $i = 1,\ldots,r$.
For convenience we shall use a bar underscore to denote evaluation at
$i = r+1,\ldots,m$. The main result can be stated as follows.

Theorem 1. (Sensitivity results using an augmented Lagrangian
for Problem $P(\epsilon)$).

If Assumptions A1, A2, A3 and A4 hold for Problem $P(\epsilon)$, then for
$\epsilon$ near 0 and $c > c^*$, there exists a unique, once continuously
differentiable vector function
$y_a(\epsilon,c) = (x(\epsilon,c)^T, \bar u(\epsilon,c)^T, w(\epsilon,c)^T, \underline{u}(\epsilon,c)^T)^T$
satisfying

$$\nabla_x \Phi(x,u,w,\epsilon,c) = 0, \tag{25}$$

$$u_i\, g_i(x,\epsilon) = 0, \quad i = 1,\ldots,r, \tag{26}$$

$$h_j(x,\epsilon) = 0, \quad j = 1,\ldots,p, \tag{27}$$

$$u_i\, g_i(x,\epsilon) = 0, \quad i = r+1,\ldots,m, \tag{28}$$

with
$(x(\epsilon,c)^T, u(\epsilon,c)^T, w(\epsilon,c)^T)^T = (x(\epsilon)^T, u(\epsilon)^T, w(\epsilon)^T)^T \equiv y(\epsilon)$
(the Kuhn-Tucker triple of Lemma 1) and such that for any $\epsilon$ near 0
and $c > c^*$, $x(\epsilon,c)$ is a locally unique unconstrained local
minimizing point of $\Phi[x,u(\epsilon,c),w(\epsilon,c),\epsilon,c]$ and
$\nabla_x^2 \Phi[x,u,w,\epsilon,c]$ is positive definite for $(x,u,w)$
near $(x^*,u^*,w^*)$.

Proof.

The Jacobian matrix of Equations (25) - (28) taken with respect to
$(x,\bar u,w,\underline{u})$, the precise analogy of the Jacobian
$M(\epsilon)$ of the Kuhn-Tucker system (2) - (4), is denoted
$M_a(\epsilon,c)$. (29)

Under the given assumptions, it follows that $M_a^{-1}$ exists and hence, by
the implicit function theorem, there exists a unique, once continuously
differentiable vector function
$y_a(\epsilon,c) = (x(\epsilon,c)^T, \bar u(\epsilon,c)^T, w(\epsilon,c)^T, \underline{u}(\epsilon,c)^T)^T$
satisfying Equations (25) - (28) for $\epsilon$ near 0 and $c > c^*$.

As indicated above, $\nabla_x^2 \Phi$ is positive definite at, and hence
near, $(x^*,u^*,w^*,0,c)$. It follows that $x(\epsilon,c)$ is a locally
unique local minimum of $\Phi(x,u(\epsilon,c),w(\epsilon,c),\epsilon,c)$ for
$\epsilon$ near 0 and $c > c^*$.

Observing that $\nabla_x \Phi = \nabla_x L$ at
$(x(\epsilon,c),u(\epsilon,c),w(\epsilon,c))$ near $\epsilon = 0$,
a comparison of (25) - (28) and the Kuhn-Tucker system (2) - (4), and the
uniqueness of the solutions of both systems, implies that
$(x(\epsilon,c),u(\epsilon,c),w(\epsilon,c)) = (x(\epsilon),u(\epsilon),w(\epsilon))$
for $\epsilon$ near 0, and the proof is complete.

It may also be noted in passing that $\Phi \equiv f$ at
$(x(\epsilon,c),u(\epsilon,c),w(\epsilon,c))$ for $\epsilon$ near 0. Along
with the conclusions given in the last part of the proof, this immediately
implies the results that precisely parallel those given in Lemma 2 for the
optimal value function and its first and second derivatives, where the
augmented Lagrangian $\Phi$ replaces the usual Lagrangian $L$ in the given
expressions.

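Since the system (25) - (28) is once continuously differentiable and
identically equal to zero for $\epsilon$ near 0 and $c > c^*$, it can be
differentiated with respect to $\epsilon$ to yield

$$M_a(\epsilon,c)\,\nabla_\epsilon y_a(\epsilon,c) = N_a(\epsilon,c), \tag{30}$$

where $M_a(\epsilon,c)$ is defined by Equation (29) for $\epsilon$ near 0 and
$c > c^*$,

$$\nabla_\epsilon y_a(\epsilon,c) = \begin{bmatrix} \nabla_\epsilon x(\epsilon,c) \\ \nabla_\epsilon \bar u(\epsilon,c) \\ \nabla_\epsilon w(\epsilon,c) \\ \nabla_\epsilon \underline{u}(\epsilon,c) \end{bmatrix}
\quad \text{and} \quad
N_a(\epsilon,c) = \begin{bmatrix} -\nabla_{\epsilon x}^2 \Phi \\ -\bar U \nabla_\epsilon \bar g \\ -\nabla_\epsilon h \\ -\underline{U}\, \nabla_\epsilon \underline{g} \end{bmatrix}.$$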

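Notice first that Equation (30) yields
$\underline{G}\,\nabla_\epsilon \underline{u}(\epsilon,c) = 0$ and since
$\underline{G} > 0$, it follows that
$\nabla_\epsilon \underline{u}(\epsilon,c) = 0$, as expected since
$u_i(\epsilon,c) = 0$ for $i \notin B^*(0)$. Let $\bar M_a(\epsilon,c)$,
$\nabla_\epsilon \bar y_a(\epsilon,c)$ and $\bar N_a(\epsilon,c)$ be the
portions of (30) excluding the nonbinding inequality constraints. Then with

$$\hat M_a(\epsilon,c) = \begin{bmatrix} \nabla_x^2 \Phi & P^T \\ P & 0 \end{bmatrix}$$

it follows, as in (10), that

$$\bar M_a(\epsilon,c)^{-1} = \hat M_a(\epsilon,c)^{-1} \begin{bmatrix} I & 0 & 0 \\ 0 & -\bar U^{-1} & 0 \\ 0 & 0 & I \end{bmatrix}.$$

Letting

$$\hat M_a(\epsilon,c)^{-1} = \begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix},$$

we have that

$$\nabla_\epsilon \bar y_a(\epsilon,c) = \begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix} \begin{bmatrix} -\nabla_{\epsilon x}^2 \Phi \\ \nabla_\epsilon \bar g \\ -\nabla_\epsilon h \end{bmatrix}. \tag{31}$$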


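The elements $C_{ij}$ are immediately determined using a result given
in Section 2. Recall the computations for the general problem when
$\nabla_x^2 L^{-1}$ exists (Equations (13)). The situation here is identical
with $\nabla_x^2 \Phi$ replacing $\nabla_x^2 L$ and we obtain

$$\begin{aligned}
C_{11} &= \nabla_x^2 \Phi^{-1}\,(\,I - P^T (P\,\nabla_x^2 \Phi^{-1} P^T)^{-1} P\,\nabla_x^2 \Phi^{-1}\,), \\
C_{12} &= C_{21}^T = \nabla_x^2 \Phi^{-1} P^T (P\,\nabla_x^2 \Phi^{-1} P^T)^{-1}, \\
C_{22} &= -(P\,\nabla_x^2 \Phi^{-1} P^T)^{-1}.
\end{aligned} \tag{32}$$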


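Following the development in Section 2 (in particular, using
Equation (18)), the estimate of the first order sensitivity of the
Kuhn-Tucker triple for Problem $P(\epsilon)$ is given by

$$\nabla_\epsilon \bar y_a(\epsilon,c) = \begin{bmatrix} \nabla_\epsilon x(\epsilon,c) \\ \nabla_\epsilon \bar u(\epsilon,c) \\ \nabla_\epsilon w(\epsilon,c) \end{bmatrix}
= \begin{bmatrix}
-C_{11} \nabla_{\epsilon x}^2 \Phi + C_{12} \begin{bmatrix} \nabla_\epsilon \bar g \\ -\nabla_\epsilon h \end{bmatrix} \\
-C_{21} \nabla_{\epsilon x}^2 \Phi + C_{22} \begin{bmatrix} \nabla_\epsilon \bar g \\ -\nabla_\epsilon h \end{bmatrix}
\end{bmatrix}. \tag{33}$$

5. Equivalence of Sensitivity Calculations

Since $y_a(\epsilon,c) \equiv y(\epsilon)$ near $\epsilon = 0$, we must have
$\nabla_\epsilon y_a(\epsilon,c) \equiv \nabla_\epsilon y(\epsilon)$ for
$\epsilon$ near 0.

The structure of the original problem and the augmented Lagrangian
can be used to demonstrate explicitly that
$\nabla_\epsilon \bar y_a(\epsilon,c) = \nabla_\epsilon \bar y(\epsilon)$.
Recall that for the given Problem $P(\epsilon)$,

$$\nabla_\epsilon \bar y(\epsilon) = \hat M(\epsilon)^{-1} \hat N(\epsilon)
= \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
\begin{bmatrix} -\nabla_{\epsilon x}^2 L \\ \nabla_\epsilon \bar g \\ -\nabla_\epsilon h \end{bmatrix}, \tag{18}$$

where the components $A_{ij}$ are defined by Equations (13), (14), or (17),
depending on which conditions apply.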

Differentiating  Equation  (20)  with  respect  to  c yields 
2 2 


r 

+ c Z 

i=l 

+ c 

r 

Z 

1=1 

P 

+ c Z 
3=1 

h.V  ^h. 

3 ex  j 

+ c 

P 

z 

3=1 

V h.\  h , 
X 3 e 3 

These  and  the  following  equations  are  evaluated  at  y (e,c)  for  e near  0 , 


and  hence,  g.(x(e,c),G)  = 0 , 1 c J = B*(0)  and  h^(x(e,c),e)  = 0 , 
3 ~ • 


This  yields 


2^  '2  T 

V 0 = V L - cP 
ex  ex 


V^g 


h 

e 


22 
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with  P defined  as  before.  An  explicit  expression  for  V y (e,c)  can 

C A 


now  be  derived  which  simplifies  (33) . Since 


C V ^0  = C 7 \ - cC„p'‘^ 

11  ex  11  ex  11 


V,i 

-7  h 
e 


and  since  (32)  implies  that 


; X 

V = 0’ 


it  follows  that 


Similarly, 


C.,7  ^0  = Ct.7  \ 

11  ex  11  ex 


= Sl^^ex'"  - 


and  since  (32)  implies  that 


we  obtain 


Si”  ■ 1 • 


" “ 

7 a 

r 

-7  h 
e 

'v' 

-7  h 

_ e _ 


The  first  order  sensitivity  of  the  Kuhn-Tucker  triple  may  then  be  written  as 

2 r 

-C„7  L + C,, 

11  ex  12  ^ 

e 


V y (e,c) 
e a 


e 


-1  P °T 

M (e.c)  ^ + 

Lo  <^U 


for  e near  0 . 
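Since it was previously concluded that
$\nabla_\varepsilon y(\varepsilon) = \nabla_\varepsilon y_a(\varepsilon,c)$ for
$\varepsilon$ near 0 and $c > c^*$, it must be true that

\[
M(\varepsilon)^{-1} = M_a(\varepsilon,c)^{-1} + \begin{bmatrix} 0 & 0 \\ 0 & cI \end{bmatrix}. \tag{35}
\]

To show explicitly that this relationship holds, note that

\[
\begin{aligned}
M_a(\varepsilon,c)
&= \begin{bmatrix} \nabla^2\Phi & P^T \\ P & 0 \end{bmatrix}
&& \text{(definition)} \\
&= \begin{bmatrix} \nabla^2 L + cP^TP & P^T \\ P & 0 \end{bmatrix}
&& \text{(using (24))} \\
&= \begin{bmatrix} \nabla^2 L & P^T \\ P & 0 \end{bmatrix}
 + \begin{bmatrix} \nabla^2\Phi & P^T \\ P & 0 \end{bmatrix}
   \begin{bmatrix} 0 & 0 \\ 0 & cI \end{bmatrix}
   \begin{bmatrix} \nabla^2 L & P^T \\ P & 0 \end{bmatrix} \\
&= M(\varepsilon) + M_a(\varepsilon,c) \begin{bmatrix} 0 & 0 \\ 0 & cI \end{bmatrix} M(\varepsilon)
&& \text{(using (11))} \\
&= \left( I + M_a(\varepsilon,c) \begin{bmatrix} 0 & 0 \\ 0 & cI \end{bmatrix} \right) M(\varepsilon) .
\end{aligned}
\]

Premultiplying the last equation by $M_a(\varepsilon,c)^{-1}$ and postmultiplying by
$M(\varepsilon)^{-1}$ yields (35).
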
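Relation (35) can be checked numerically on a small instance. The following is a minimal sketch and not part of the original development. It assumes a hypothetical problem, minimize $x_1^2 - \tfrac{1}{2}x_2^2$ subject to $h = x_2 - \varepsilon = 0$, for which $\nabla^2 L = \mathrm{diag}(2,-1)$ is indefinite, $P = \nabla_x h = (0,1)$, $N(\varepsilon) = (0,0,1)^T$, and $\nabla^2\Phi = \nabla^2 L + cP^TP$ is positive definite for $c > 1$. The helper names (`inverse`, `bordered`, `matvec`) and the choice $c = 3$ are illustrative assumptions.

```python
from fractions import Fraction


def inverse(a):
    """Gauss-Jordan inverse of a square matrix in exact rational arithmetic."""
    n = len(a)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * pv for v, pv in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matvec(a, v):
    """Matrix-vector product."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]


c = 3                        # assumed penalty constant, larger than c* = 1 here
hess_L = [[2, 0], [0, -1]]   # Hessian of the usual Lagrangian (indefinite)
P = [[0, 1]]                 # gradient of the single equality constraint

# Hessian of the augmented Lagrangian at h = 0: hess_L + c * P^T P.
hess_Phi = [[hess_L[i][j] + c * P[0][i] * P[0][j] for j in range(2)]
            for i in range(2)]


def bordered(H):
    """Assemble the bordered matrix with blocks H, P^T, P and 0."""
    return [H[0] + [P[0][0]], H[1] + [P[0][1]], P[0] + [0]]


M = bordered(hess_L)
M_a = bordered(hess_Phi)
D = [[0, 0, 0], [0, 0, 0], [0, 0, c]]   # the block matrix diag(0, cI)
N = [0, 0, 1]                            # (-grad^2_{eps x} L, -grad_eps h)

M_inv = inverse(M)
M_a_inv = inverse(M_a)
shifted = [[M_a_inv[i][j] + D[i][j] for j in range(3)] for i in range(3)]

dy_usual = matvec(M_inv, N)    # Equation (18)
dy_aug = matvec(shifted, N)    # Equation (34)
```

For this instance both computations give $\nabla_\varepsilon(x_1, x_2, w) = (0, 1, 1)$, which agrees with the closed-form solution $x(\varepsilon) = (0, \varepsilon)$, $w(\varepsilon) = \varepsilon$.
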
6. Conclusions and Extensions

Under the given conditions, the formulas (32) and (34) obtained using
the augmented Lagrangian may be viewed as an alternative computational device
for calculating the first-order sensitivity of a Kuhn-Tucker triple for
Problem $P(\varepsilon)$, encompassing the formula (Equation (18)) obtained in Section 2
together with the formulas (13), (14) and (17) that apply in the various
cases specified. That is, using the augmented Lagrangian instead of the
usual Lagrangian, only Case 1 (Section 2) can arise, and hence only the
one set of formulas (32) for the inverse of the Jacobian is needed under
the given conditions. Note that both approaches require knowledge of the
binding inequality constraint indices and in fact require the determination
of a Kuhn-Tucker triple. In addition, Equations (32) and (34) require a
value of $c$ for which $\nabla^2\Phi$ is positive definite. Using either the usual
Lagrangian or the augmented Lagrangian, the indicated information permits
an exact calculation of the first-order sensitivity of the Kuhn-Tucker
triple.

The requirements for exact sensitivity information are rather severe
and in effect result in "post-optimality" sensitivity analysis calculations.
If inexact sensitivity information is considered, then one obvious possibility
is to use estimates of the local solution and its associated optimal Lagrange
multipliers in the given formulas. (The question of error bounds arises, one
that we do not pursue here. However, for certain important results in this
connection, the interested reader is referred to Robinson (1973).) With
the usual Lagrangian (Section 2) problem-oriented approach, any algorithm could
be used that provides such estimates.

The augmented Lagrangian approach is already "algorithmic", involving
first estimating the constant $c$ and the optimal Lagrange multipliers
$u(\varepsilon,c)$, $w(\varepsilon,c)$ for Problem $P(\varepsilon)$, and then minimizing $\Phi$ over $x$. Of
course, any valid unconstrained minimization algorithm could be used, as could
any appropriate procedure for estimating the optimal multipliers. In this
regard, from Theorem 1 it may be observed that if $u(\varepsilon)$, $w(\varepsilon)$ are the (locally)
unique optimal Lagrange multipliers for Problem $P(\varepsilon)$ associated with $x(\varepsilon)$
and if $\varepsilon$ is sufficiently close to 0, then
$(u(\varepsilon,c), w(\varepsilon,c)) = (u(\varepsilon), w(\varepsilon)) \to (u^*, w^*)$ as $\varepsilon \to 0$, and hence the locally
unique local minimum $x(\varepsilon,c)$ of $\Phi(x, u(\varepsilon), w(\varepsilon), c)$ is given by
$x(\varepsilon,c) = x(\varepsilon)$, and $x(\varepsilon) \to x(0) = x^*$. Clearly, the first partial
derivatives (with respect to $\varepsilon$) of $(x(\varepsilon,c), u(\varepsilon,c), w(\varepsilon,c))$ also converge
(component by component) to the first partial derivatives of
$(x(0), u(0), w(0)) = (x^*, u^*, w^*)$. Actually, any procedure that determines the
optimal multipliers of Problem $P(0)$ as $\varepsilon \to 0$ can be used. Unconstrained
minimization of the augmented Lagrangian (in the appropriate neighborhood)
will then yield an estimate of the local solution $x(0)$ of Problem $P(0)$,
and the formulas given in (32) and (34) can be used to calculate the
corresponding estimates of the first partial derivatives of the Kuhn-Tucker
triple $(x(0), u(0), w(0))$ of Problem $P(0)$.
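
The alternation between multiplier estimation and unconstrained minimization can be illustrated on a small instance. The sketch below is only one possible realization, under stated assumptions: the hypothetical problem is to minimize $x_1^2 - \tfrac{1}{2}x_2^2$ subject to $h = x_2 - \varepsilon = 0$, the augmented Lagrangian is taken as $\Phi = f + wh + \tfrac{c}{2}h^2$, the inner minimization is done in closed form (valid for $c > 1$), and the multiplier estimate is refreshed with the first-order update $w \leftarrow w + ch$ commonly attributed to Hestenes and Powell. The text does not prescribe a particular update rule, and the function names are illustrative.

```python
def minimize_phi(w, eps, c):
    """Closed-form minimizer over x of
    Phi = x1**2 - 0.5*x2**2 + w*(x2 - eps) + 0.5*c*(x2 - eps)**2 (needs c > 1)."""
    x1 = 0.0
    x2 = (c * eps - w) / (c - 1.0)
    return x1, x2


def method_of_multipliers(eps, c=3.0, w=0.0, iters=60):
    """Alternate unconstrained minimization of Phi with a multiplier update."""
    for _ in range(iters):
        x1, x2 = minimize_phi(w, eps, c)
        w += c * (x2 - eps)          # assumed first-order multiplier update
    x1, x2 = minimize_phi(w, eps, c)
    return x1, x2, w


# Estimate the triple at eps and at eps + delta, then form difference
# quotients as a crude check on the first-order sensitivities.
eps, delta = 0.2, 1e-3
y0 = method_of_multipliers(eps)
y1 = method_of_multipliers(eps + delta)
slope = [(b - a) / delta for a, b in zip(y0, y1)]
```

For this instance the iterates approach $(x_1, x_2, w) = (0, \varepsilon, \varepsilon)$, and the difference quotients approach $(0, 1, 1)$, the derivatives of that closed-form triple with respect to $\varepsilon$.
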

An alternative to the above procedures for estimating the desired
sensitivity information is the penalty function procedure mentioned in the
Introduction and developed rather extensively by Fiacco (1976) and Armacost
and Fiacco (1974), (1975), (1976a) in terms of a well-known logarithmic-quadratic
penalty function. It is shown that local unconstrained
minimization of the penalty function yields an estimate of the Kuhn-Tucker
triple under the same conditions assumed throughout this paper. Also,
analogous to the augmented Lagrangian result, the penalty function Hessian
is shown to be positive definite in the appropriate neighborhood, so that
one set of formulas (analogous to (32) and (34)) suffices to calculate the
partial derivatives of the Kuhn-Tucker triple. No prior calculation of the
optimal Lagrange multipliers or of any other information is needed; only
the unconstrained minimizing points (for a preset scalar parameter)
normally required by the algorithm are used to estimate the triple
and its derivatives. Thus, the required effort is comparatively modest
with respect to the Lagrangian and augmented Lagrangian calculations,
and this makes the procedure appealing as a pre-optimality sensitivity analysis
estimation technique. This appeal is somewhat offset by the typical
ill-conditioning that characterizes the penalty function Hessian near a local
solution of the given problem. Thus, all the indicated approaches involve
compensating factors, each offering advantages and disadvantages.
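
The ill-conditioning mentioned above can be seen on a toy case. The sketch below is illustrative only: it uses a plain quadratic penalty on one equality constraint rather than the logarithmic-quadratic function of the cited work, and assumes the hypothetical problem of minimizing $x_1^2 + x_2^2$ subject to $x_1 + x_2 - 1 = 0$ with penalty parameter $r > 0$. The penalty Hessian is then $2I + \tfrac{1}{r}aa^T$ with $a = (1,1)^T$, and the function names are assumptions.

```python
import math


def sym2_eigenvalues(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]], smallest first."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean - radius, mean + radius


def penalty_hessian_condition(r):
    """Condition number of the Hessian of
    x1**2 + x2**2 + (x1 + x2 - 1)**2 / (2*r), i.e. of 2*I + (1/r)*[[1, 1], [1, 1]]."""
    lo, hi = sym2_eigenvalues(2.0 + 1.0 / r, 1.0 / r, 2.0 + 1.0 / r)
    return hi / lo


# The condition number grows without bound as the penalty parameter r shrinks.
conditions = {r: penalty_hessian_condition(r) for r in (1.0, 1e-2, 1e-4)}
```

In this example the condition number equals $1 + 1/r$, so driving $r$ toward 0 to sharpen the estimate also degrades the conditioning of the Hessian. For the same toy problem, an augmented Lagrangian Hessian $2I + c\,aa^T$ with $c$ held fixed has the fixed condition number $1 + c$.
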

As a final observation, the expression developed above for $\nabla_\varepsilon y_a(\varepsilon,c)$
in Equations (32) and (34) can be placed in precise correspondence with the
sensitivity expressions obtained by Buys and Gonin (1975) in their Equations
(11) and (12). Our method of proof is simpler, however, utilizing as shown
a result previously obtained for the usual Lagrangian. Further, the
relationship between the Lagrangian and augmented Lagrangian calculations is
demonstrated explicitly and allows application of all prior sensitivity
results involving the usual Lagrangian.

Although a specific augmented Lagrangian function was used above to
obtain the sensitivity results, the analysis and analogous results hold
for a more general function, such as the Arrow, Gould and Howe (1973)
"extended" Lagrangian, which encompasses most of the popular functions of
the augmented type. A more general concept, an "Acceptable Sequential
Algorithm" (ASA), was proposed by Fiacco and developed by Armacost (1976b).
It is noted that most popular penalty and barrier functions qualify as an
ASA, so that sensitivity results analogous to those developed here can be
obtained and will provide estimates of the sensitivity of the optimal value
function and the Kuhn-Tucker triple.